Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation
The recent advances in deep learning have made it possible to generate
photo-realistic images by using neural networks and even to extrapolate video
frames from an input video clip. In this paper, for the sake of both furthering
this exploration and our own interest in a realistic application, we study
image-to-video translation and particularly focus on the videos of facial
expressions. This problem challenges deep neural networks with an additional
temporal dimension compared to image-to-image translation. Moreover, its
single input image defeats most existing video generation methods, which rely on
recurrent models. We propose a user-controllable approach to generate
video clips of various lengths from a single face image. The lengths and types
of the expressions are controlled by users. To this end, we design a novel
neural network architecture that can incorporate the user input into its skip
connections and propose several improvements to the adversarial training method
for the neural network. Experiments and user studies verify the effectiveness
of our approach. In particular, we highlight that even for face
images in the wild (downloaded from the Web and the authors' own photos), our
model can generate high-quality facial expression videos, of which about 50%
are labeled as real by Amazon Mechanical Turk workers.
Comment: 10 page
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
In this paper, we propose an Attentional Generative Adversarial Network
(AttnGAN) that allows attention-driven, multi-stage refinement for fine-grained
text-to-image generation. With a novel attentional generative network, the
AttnGAN can synthesize fine-grained details at different subregions of the
image by paying attention to the relevant words in the natural language
description. In addition, a deep attentional multimodal similarity model is
proposed to compute a fine-grained image-text matching loss for training the
generator. The proposed AttnGAN significantly outperforms the previous state of
the art, boosting the best reported inception score by 14.14% on the CUB
dataset and 170.25% on the more challenging COCO dataset. A detailed analysis
is also performed by visualizing the attention layers of the AttnGAN. It
shows for the first time that the layered attentional GAN is able to automatically
select the condition at the word level for generating different parts of the
image.
Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation
We propose a hierarchically structured reinforcement learning approach to
address the challenges of planning for generating coherent multi-sentence
stories for the visual storytelling task. Within our framework, the task of
generating a story given a sequence of images is divided across a two-level
hierarchical decoder. The high-level decoder constructs a plan by generating a
semantic concept (i.e., topic) for each image in sequence. The low-level
decoder generates a sentence for each image using a semantic compositional
network, which effectively grounds the sentence generation conditioned on the
topic. The two decoders are jointly trained end-to-end using reinforcement
learning. We evaluate our model on the visual storytelling (VIST) dataset.
Empirical results from both automatic and human evaluations demonstrate that
the proposed hierarchically structured reinforced training achieves
significantly better performance compared to a strong flat deep reinforcement
learning baseline.
Comment: Accepted to AAAI 201
Semileptonic Meson Decays Into A Highly Excited Charmed Meson Doublet
We study the heavy quark effective theory prediction for semileptonic
decays into an orbital excited -wave charmed doublet, the (, )
states (, ), at the leading order of heavy quark expansion.
The corresponding universal form factor is estimated by using the QCD sum rule
method. The decay rates we predict are and . The branching ratios are
and
, respectively.
Comment: 6 pages, 2 figure
Distributed Multicell Beamforming Design Approaching Pareto Boundary with Max-Min Fairness
This paper addresses coordinated downlink beamforming optimization in
multicell time-division duplex (TDD) systems where a small number of parameters
are exchanged between cells but with no data sharing. With the goal of reaching
the point on the Pareto boundary with max-min rate fairness, we first develop a
two-step centralized optimization algorithm to design the joint beamforming
vectors. This algorithm can achieve a further sum-rate improvement over the
max-min optimal performance, and is shown to guarantee max-min Pareto
optimality for scenarios with two base stations (BSs) each serving a single
user. To realize a distributed solution with limited intercell communication,
we then propose an iterative algorithm by exploiting an approximate
uplink-downlink duality, in which only a small number of positive scalars are
shared between cells in each iteration. Simulation results show that the
proposed distributed solution achieves a fairness rate performance close to the
centralized algorithm while it has a better sum-rate performance, and
demonstrates a better tradeoff between sum-rate and fairness than the Nash
Bargaining solution, especially at high signal-to-noise ratio.
Comment: 8 figures. To Appear in IEEE Trans. Wireless Communications, 201
Unconventional Superconducting Symmetry in a Checkerboard Antiferromagnet
We use a renormalized mean field theory to study the Gutzwiller projected BCS
states of the extended Hubbard model in the large limit, or the
--- model on a two-dimensional checkerboard lattice. At small
, the frustration due to the diagonal terms of and does not
alter the -wave pairing symmetry, and the negative (positive)
enhances (suppresses) the pairing order parameter. At large , the
ground state has an extended s-wave symmetry. At the intermediate , the
ground state is or -wave with time reversal symmetry broken.
Comment: 6 pages, 6 figure
Shallow Triple Stream Three-dimensional CNN (STSTNet) for Micro-expression Recognition
In recent years, the state of the art in facial micro-expression recognition
has been significantly advanced by deep neural networks. The robustness of
deep learning has yielded promising performance beyond that of traditional
handcrafted approaches. Most works in the literature emphasize increasing the
depth of networks and employing highly complex objective functions to learn
more features. In this paper, we design a Shallow Triple Stream
Three-dimensional CNN (STSTNet) that is computationally light whilst capable of
extracting discriminative high level features and details of micro-expressions.
The network learns from three optical flow features (i.e., optical strain,
horizontal and vertical optical flow fields) computed based on the onset and
apex frames of each video. Our experimental results demonstrate the
effectiveness of the proposed STSTNet, which obtained an unweighted average
recall rate of 0.7605 and unweighted F1-score of 0.7353 on the composite
database consisting of 442 samples from the SMIC, CASME II and SAMM databases.
Comment: 5 pages, 1 figure, Accepted and published in IEEE FG 201
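The STSTNet abstract above reports an unweighted average recall (UAR) and an unweighted F1-score (UF1). As a minimal sketch of how such class-balanced metrics are commonly computed (this is the standard macro-averaged definition, assumed here rather than taken from the paper; the function name and the toy labels are hypothetical), each class contributes equally regardless of how many samples it has:

```python
def unweighted_recall_and_f1(y_true, y_pred):
    """Return (UAR, UF1): per-class recall and F1, averaged with equal class weight.

    Assumption: classes are taken from the ground-truth labels only, so every
    class has at least one true sample and recall is always defined.
    """
    classes = sorted(set(y_true))
    recalls, f1_scores = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        recalls.append(tp / (tp + fn))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    n = len(classes)
    return sum(recalls) / n, sum(f1_scores) / n


# Toy example with three imbalanced classes (labels are illustrative only).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0]
uar, uf1 = unweighted_recall_and_f1(y_true, y_pred)
print(f"UAR={uar:.4f}  UF1={uf1:.4f}")
```

Averaging per class rather than per sample keeps a rare class from being drowned out by frequent ones, which matters when a composite database pools sources with different label distributions.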
Bulge formation from SSCs in a responding cuspy dark matter halo
We simulate the bulge formation in very late-type dwarf galaxies from
circumnuclear super star clusters (SSCs) moving in a responding cuspy dark
matter halo (DMH). The simulations show that (1) the response of DMH to sinking
of SSCs is detectable only in the region interior to about 200 pc. The mean
logarithmic slope of the responding DM density profile over that area displays
two different phases: a very early descent followed by an ascent until it
approaches 1.2 at the age of 2 Gyr. (2) The detectable feedback of the
DMH response on the bulge formation turned out to be very small, in the sense
that the formed bulges and their paired nuclear cusps in the fixed and the
responding DMH are basically the same, and both are consistent with
observations. (3) The yielded mass correlation of bulges to their nuclear
(stellar) cusps and the time evolution of cusps' mass are in accordance with
recent findings on relevant relations. In combination with the consistent
effective radii of nuclear cusps with observed quantities of nuclear clusters,
we believe that the bulge formation scenario that we proposed could be a very
promising mechanism to form nuclear clusters.
Comment: 27 pages, 11 figures, accepted for publication in Ap